Incentivizing Exploration with Heterogeneous Value of Money
Recently, Frazier et al. proposed a natural model for crowdsourced
exploration of different a priori unknown options: a principal is interested in
the long-term welfare of a population of agents who arrive one by one in a
multi-armed bandit setting. However, each agent is myopic, so in order to
incentivize him to explore options with better long-term prospects, the
principal must offer the agent money. Frazier et al. showed that a simple class
of policies, called time-expanded policies, is optimal in the worst case, and
characterized their budget-reward tradeoff.
The previous work assumed that all agents are equally and uniformly
susceptible to financial incentives. In reality, agents may have different
utility for money. We therefore extend the model of Frazier et al. to allow
agents that have heterogeneous and non-linear utilities for money. The
principal is informed of the agent's tradeoff via a signal that could be more
or less informative.
Our main result is to show that a convex program can be used to derive a
signal-dependent time-expanded policy which achieves the best possible
Lagrangian reward in the worst case. The worst-case guarantee is matched by
so-called "Diamonds in the Rough" instances; the proof that the guarantees
match is based on showing that two different convex programs have the same
optimal solution for these specific instances. These results also extend to the
budgeted case as in Frazier et al. We also show that the optimal policy is
monotone with respect to information, i.e., the approximation ratio of the
optimal policy improves as the signals become more informative.
Comment: WINE 201
On the Prior Sensitivity of Thompson Sampling
The empirically successful Thompson Sampling algorithm for stochastic bandits
has drawn much interest in understanding its theoretical properties. One
important benefit of the algorithm is that it allows domain knowledge to be
conveniently encoded as a prior distribution to balance exploration and
exploitation more effectively. While it is generally believed that the
algorithm's regret is low (high) when the prior is good (bad), little is known
about the exact dependence. In this paper, we fully characterize the
algorithm's worst-case dependence of regret on the choice of prior, focusing on
a special yet representative case. These results also provide insights into the
general sensitivity of the algorithm to the choice of priors. In particular,
in terms of the prior probability mass of the true reward-generating model,
we prove regret upper bounds for the bad- and good-prior cases, respectively,
as well as matching lower bounds. Our proofs rely on the discovery of a
fundamental property of Thompson
Sampling and make heavy use of martingale theory, both of which appear novel in
the literature, to the best of our knowledge.
Comment: Appears in the 27th International Conference on Algorithmic Learning Theory (ALT), 201
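The prior's role in Thompson Sampling can be seen in a minimal Bernoulli-bandit sketch (illustrative code, not the paper's construction): the Beta pseudo-counts encode the prior, and a prior that concentrates mass on a bad arm must first be washed out by observed rewards before the algorithm recovers.

```python
import random

def thompson_sampling(reward_probs, horizon, prior=(1.0, 1.0), seed=0):
    """Bernoulli Thompson Sampling with a Beta prior on each arm.

    `prior` is the (alpha, beta) pseudo-count pair encoding domain
    knowledge: a "good" prior that favours the true best arm speeds up
    convergence, while a "bad" prior must first be unlearned from data.
    Returns the total reward collected over `horizon` rounds.
    """
    rng = random.Random(seed)
    k = len(reward_probs)
    alpha = [prior[0]] * k
    beta = [prior[1]] * k
    total_reward = 0
    for _ in range(horizon):
        # Sample a mean estimate for each arm from its Beta posterior
        # and play the arm with the highest sample.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        reward = 1 if rng.random() < reward_probs[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        total_reward += reward
    return total_reward
```

With arms of mean 0.3 and 0.7 and a uniform prior, the total reward over a long horizon approaches that of always playing the better arm.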
Bandit Models of Human Behavior: Reward Processing in Mental Disorders
Drawing an inspiration from behavioral studies of human decision making, we
propose here a general parametric framework for the multi-armed bandit problem,
which extends the standard Thompson Sampling approach to incorporate reward
processing biases associated with several neurological and psychiatric
conditions, including Parkinson's and Alzheimer's diseases,
attention-deficit/hyperactivity disorder (ADHD), addiction, and chronic pain.
We demonstrate empirically that the proposed parametric approach can often
outperform the baseline Thompson Sampling on a variety of datasets. Moreover,
from the behavioral modeling perspective, our parametric framework can be
viewed as a first step towards a unifying computational model capturing reward
processing abnormalities across multiple mental conditions.
Comment: Conference on Artificial General Intelligence, AGI-1
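The abstract does not spell out the parametric form of the biases; one simple possibility in this spirit (the weights `w_pos` and `w_neg` are hypothetical illustrations, not the paper's parameters) is to weight positive and negative outcomes asymmetrically in the Beta posterior update used by Thompson Sampling:

```python
def biased_update(alpha, beta, reward, w_pos=1.0, w_neg=1.0):
    """One Beta-posterior update with asymmetric reward weighting.

    w_pos and w_neg are hypothetical bias parameters: w_pos < 1 could
    mimic blunted sensitivity to rewards, while w_neg > 1 could mimic
    heightened sensitivity to negative outcomes. Setting
    w_pos = w_neg = 1 recovers the standard Thompson Sampling update
    for a Bernoulli arm.
    """
    if reward:
        return alpha + w_pos, beta
    return alpha, beta + w_neg
```

Fitting such weights to behavioural data is one way a single parametric family could span both standard Thompson Sampling and biased variants.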
Understanding the Spatial Clustering of Severe Acute Respiratory Syndrome (SARS) in Hong Kong
We applied cartographic and geostatistical methods in analyzing the patterns of disease spread during the 2003 severe acute respiratory syndrome (SARS) outbreak in Hong Kong using geographic information system (GIS) technology. We analyzed an integrated database that contained clinical and personal details on all 1,755 patients confirmed to have SARS from 15 February to 22 June 2003. Elementary mapping of disease occurrences in space and time simultaneously revealed the geographic extent of spread throughout the territory. Statistical surfaces created by the kernel method confirmed that SARS cases were highly clustered and identified distinct disease "hot spots." Contextual analysis of mean and standard deviation of different density classes indicated that the period from day 1 (18 February) through day 16 (6 March) was the prodrome of the epidemic, whereas days 86 (15 May) to 106 (4 June) marked the declining phase of the outbreak. Origin-and-destination plots showed the directional bias and radius of spread of superspreading events. Integration of GIS technology into routine field epidemiologic surveillance can offer a real-time quantitative method for identifying and tracking the geospatial spread of infectious diseases, as our experience with SARS has demonstrated.
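The kernel-method surfaces described above can be sketched in a few lines (a generic Gaussian kernel density estimate, not the study's exact parameterisation): each case location contributes a smooth bump, and local maxima of the summed surface mark the hot spots.

```python
import math

def kernel_density_surface(points, grid, bandwidth):
    """Gaussian kernel density estimate of case intensity on a grid.

    `points` are (x, y) case locations, `grid` is a list of (x, y)
    evaluation sites, and `bandwidth` controls the smoothing radius.
    Spatial clusters ("hot spots") appear as local density maxima.
    """
    norm = 1.0 / (2 * math.pi * bandwidth ** 2 * len(points))
    surface = []
    for gx, gy in grid:
        # Sum the Gaussian kernel contribution of every case location.
        density = sum(
            math.exp(-((gx - x) ** 2 + (gy - y) ** 2) / (2 * bandwidth ** 2))
            for x, y in points
        )
        surface.append(density * norm)
    return surface
```

Evaluating the surface near a cluster of cases yields a much higher density than at isolated or empty locations, which is what makes the hot spots stand out on the map.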
Molecular Identification of Spirometra erinaceieuropaei Tapeworm in Cases of Human Sparganosis, Hong Kong
Human sparganosis is a foodborne zoonosis endemic in Asia. We report a series of 9 histologically confirmed human sparganosis cases in Hong Kong, China. All parasites were retrospectively identified as Spirometra erinaceieuropaei. Skin and soft tissue swelling was the most common symptom, followed by central nervous system lesions.
Treatment of severe acute respiratory syndrome with lopinavir/ritonavir: A multicentre retrospective matched cohort study
Objectives. To investigate the possible benefits and adverse effects of the addition of lopinavir/ritonavir to a standard treatment protocol for the treatment of severe acute respiratory syndrome. Design. Retrospective matched cohort study. Setting. Four acute regional hospitals in Hong Kong. Patients and methods. Seventy-five patients with severe acute respiratory syndrome treated with lopinavir/ritonavir in addition to a standard treatment protocol adopted by the Hospital Authority were matched with controls retrieved from the Hospital Authority severe acute respiratory syndrome central database. Matching was done with respect to age, sex, the presence of co-morbidities, lactate dehydrogenase level and the use of pulse steroid therapy. The 75 patients treated with lopinavir/ritonavir were divided into two subgroups for analysis: lopinavir/ritonavir as initial treatment, and lopinavir/ritonavir as rescue therapy. These groups were compared with matched cohorts of 634 and 343 patients, respectively. Outcomes including overall death rate, oxygen desaturation, intubation rate, and use of pulse methylprednisolone were reviewed. Results. The addition of lopinavir/ritonavir as initial treatment was associated with a reduction in the overall death rate (2.3%) and intubation rate (0%), when compared with a matched cohort who received standard treatment (15.6% and 11.0% respectively, P<0.05) and a lower rate of use of methylprednisolone at a lower mean dose. The subgroup who had received lopinavir/ritonavir as rescue therapy, showed no difference in overall death rate and rates of oxygen desaturation and intubation compared with the matched cohort, and received a higher mean dose of methylprednisolone. Conclusion. The addition of lopinavir/ritonavir to a standard treatment protocol as an initial treatment for severe acute respiratory syndrome appeared to be associated with improved clinical outcome. 
A randomised double-blind placebo-controlled trial is recommended during future epidemics to further evaluate this treatment.
Prospective modelling of environmental dynamics. A methodological comparison applied to mountain land cover changes
During the last 10 years, scientists have made significant advances in modelling environmental dynamics. A wide range of new methodological approaches in geomatics - such as neural networks, multi-agent systems or fuzzy logic - was developed. Despite this progress, the available modelling software packages have to be considered as experimental tools rather than as mature procedures ready for environmental management or decision support. In particular, the authors consider that a large number of publications suffer from a lack of validation of the model results. This contribution describes three different modelling approaches applied to prospective land cover prediction. The first one, a combined geomatic method, uses Markov chains for temporal transition prediction, while the spatial assignment of the transitions is supervised manually through the construction of suitability maps. Compared to this directed method, the two others may be considered semi-automatic because both the polychotomous regression and the multilayer perceptron only need to be optimized during a training step - the algorithms themselves detect the spatio-temporal changes in land cover. The authors describe the three methodological approaches and their practical application to two mountain study areas: one in the French Pyrenees, the second including a large part of the Sierra Nevada, Spain. The article focuses on the comparison of results. The main finding is that prediction scores are higher where land cover is more persistent. The authors also underline that the geomatic model is complementary to the statistical ones, which achieve a higher overall prediction rate but produce worse simulations when land cover changes are numerous.
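The Markov-chain step of the first (geomatic) approach can be sketched as follows (illustrative code, not the authors' implementation): transition probabilities estimated from cell counts between two map dates are applied repeatedly to the current distribution of land-cover classes.

```python
def markov_projection(transition_counts, state, steps):
    """Project land-cover class shares with a first-order Markov chain.

    transition_counts[i][j] is the number of cells observed changing
    from class i to class j between two map dates. Each row is
    normalised into transition probabilities, which are then applied
    `steps` times to the current class distribution `state`.
    """
    probs = []
    for row in transition_counts:
        total = sum(row)
        probs.append([c / total for c in row])
    n = len(state)
    for _ in range(steps):
        # One projection step: new share of class j is the
        # probability-weighted inflow from every class i.
        state = [sum(state[i] * probs[i][j] for i in range(n))
                 for j in range(n)]
    return state
```

Note that this only predicts *how much* of each class to expect; deciding *where* on the map each predicted transition lands is exactly the spatial-assignment step that the suitability maps supervise in the combined geomatic method.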